Yuhang Zhou

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

May 31, 2026
Via arXiv

Agentic Recommender System with Hierarchical Belief-State Memory

May 14, 2026
Via arXiv

Deep Reprogramming Distillation for Medical Foundation Models

May 06, 2026
Via arXiv

CAR: Query-Guided Confidence-Aware Reranking for Retrieval-Augmented Generation

May 06, 2026
Via arXiv

Synthetic Sandbox for Training Machine Learning Engineering Agents

Apr 06, 2026
Via arXiv

LLM-Driven Reasoning for Constraint-Aware Feature Selection in Industrial Systems

Mar 26, 2026
Via arXiv

LLM-Confidence Reranker: A Training-Free Approach for Enhancing Retrieval-Augmented Generation Systems

Feb 14, 2026
Via arXiv

EBPO: Empirical Bayes Shrinkage for Stabilizing Group-Relative Policy Optimization

Feb 05, 2026
Via arXiv

OffSeeker: Online Reinforcement Learning Is Not All You Need for Deep Research Agents

Jan 26, 2026
Via arXiv

Less is More for RAG: Information Gain Pruning for Generator-Aligned Reranking and Evidence Selection

Jan 24, 2026
Via arXiv